International Journal of Engineering and Technical Research (IJETR) 
ISSN: 2321-0869 (O) 2454-4698 (P) Volume-7, Issue-6, June 2017 


A Markov Chain Sampling Method for 
Conformational Energy Optimization of Peptides 

Jiann-Horng Lin 


Abstract — The Markov chain sampling is a powerful tool for 
the optimization of complicated objective functions. It is 
introduced in order to more efficiently search the domain of the 
objective function. In many applications these functions are 
deterministic and randomness. The maximum statistic converge 
to the maximum point of probability density which establishing 
links between the Markov chain sampling and optimization 
search. This statistical computation algorithm demonstrates 
convergence property of maximum statistics in large samples 
and it is global search design to avoid on local optimal solution 
restrictions. We have developed and implemented a Markov 
chain sampling to determine the best energy minimum for 
oligopeptides. Our test molecule was Met-enkephalin, a 
pentapeptide that over the years has been used as a validation 
model for many global optimizers. The test potential energy 
function was ECEPP/3. The results indicate that the proposed 
optimization search is an efficient algorithm for conformational 
searches. 

Index Terms — Markov chain, protein structure prediction, 
protein folding, conformational space search 

I. INTRODUCTION 

The computational identification of the low energy 
structures of a peptide from its sequence alone has been a 
problem of major interest for many years. It is not an easy 
task even for small peptides, due to the multiple-minima 
problem and combinatorial explosion. A number of 
conformational search algorithms have been developed in the 
past for this purpose. We have developed an algorithm that 
addresses this problem. Peptides are short polymers made up 
of a few to a few tens of amino acids. Many of these have 
meaningful roles in biochemistry and biophysics. Some 
sequences of peptides have a clear tendency to form 
well-defined three-dimensional structures, that is, to fold. 
Peptides are also useful as model systems for much larger 
peptide chains known as proteins. The naturally occurring 
three dimensional structure of a protein, its “tertiary 
structure,” is believed to be uniquely determined by its 
“primary structure,” the sequence of amino acids of which the 
protein is composed. Anfisen [1] in his “thermodynamic 
hypothesis” proposes that the native state of a protein is the 
structure that minimizes the free energy. By definition, such a 
state would be at the global minimum of free energy relative 
to all other states accessible on that time scale. Thus, the 
conformational search, or folding, can be posed as an 
optimization problem. Conformational search of peptide 
molecules, to a first approximation, can be thought of as the 
problem of finding the 3D molecular structure that 
corresponds to the lowest local minimum of an appropriate 
mathematical function describing the potential energy of the 
system. Computer simulations are often used to carry out this 
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task. A major concern in computer simulations is to obtain a 
set of low-energy conformations with biological significance; 
that is, finding those conformations that are near the 
thermodynamic native state. Folding a protein from only a 
knowledge of its amino acid sequence is a formidable task. 
Because it is computationally impossible to test all possible 
conformations to determine the global minimum, it is 
necessary to develop methods that can land upon a global 
minimum without testing all conformational possibilities. 
This is a challenging optimization (minimization) task. In 
many cases the detailed properties of the potential function to 
be minimized are not known. Even if the function is 
differentiable, one can often encounter non-convex surfaces, 
and the local properties of the function can be different in the 
different search regions, i.e., the basins can have different size 
or depth, the smoothness can vary, etc. Many different force 
fields for proteins have been designed as a summation of a set 
of potential energy contributions. Among the most used ones 
are: ECEPP [2], MM2[3], ECEPP/2 [4], CHARMM [5], 
DISCOVER [6], AMBER [7], GROMOS87 [8], MM3 [9], 
and ECEPP/3 [10]. Most of these have a large number of 
local minima. In general, protein folding with any force field 
is a NP-hard problem [11] where the time needed to locate the 
lowest minimum grows exponentially when the number of 
variables grows linearly. A major challenge in this type of 
global optimization problems is that there is no clear 
mathematical basis for efficiently reaching the global 
minimum, thus finding the latter in an accurate and speedy 
way is of general interest. To reduce the size of the problem 
one takes advantage of the fact that under biological 
conditions some internal motions of protein molecules occur 
on a time scale much smaller than others. Experimentally, the 
average values of covalent bond distances and covalent bond 
angles are fairly constant, and lead to the assumption that 
conformational changes observed in the dihedral angles could 
fully determine the overall shape of the protein molecule. 
Thus, if one specifies the position of all atoms in the protein 
molecule as a function of its internal coordinates, under the 
assumption of constant bond lengths and bond angles, the 
problem drastically reduces the number of its degrees of 
freedom. Although the size of the problem can be reduced 
when the energy function is written in terms of torsional 
angles, it is known that in this form the energy function is no 
longer partially separable, meaning that it is no longer much 
less expensive to reevaluate the energy if only a few variables 
change than if they all change. To overcome this effect, a 
number of workers have devised interesting stochastic and 
non-stochastic methods, which impose constrains and bias the 
search towards the region where the global minimum could be 
found. Among stochastic methods employed to predict 
oligopeptide 3D structures are Monte Carlo with 
minimization (MCM) [12a, 12b], simulated annealing (SA) 
[13], threshold accepting (TA) [14], free energy Monte Carlo 
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with minimization (FMCM) [15], multi-canonical ensemble 
(ME) [16], conformational space annealing (CSA) [17], and 
genetic algorithms 18 (GA) [18]. Among non-stochastic 
methods we find molecular dynamics with minimization 
(MDM) [19], dynamic programming (DP) [20], the diffusion 
equation method (DEM) [21a], the mean-field technique 
(MFT) [22], and a global optimization procedure known as 
_BB [23]. In this article we take an approach to minimize the 
ECEPP/3 [10] energy function based on tabu search (TS) 
[24], a stochastic optimizer developed to treat complex 
combinatorial optimization tasks. Our test molecule is that of 
Met-enkephalin, a pentapeptide that has been used as a 
validation model for many global optimizers, and because its 
lowest energy conformation for the potential energy function 
ECEPP/3 is known [23]. We first present the problem we are 
dealing with in a mathematical fashion, then we discuss the 
general principle of the proposed sampling algorithm and 
explain how to use this algorithm for conformational search. 
Finally, we present our computational results. 

Protein Conformational Search Problem 

As indicated above, the conformation of a protein with a 
sequence of N res amino acid residues in the peptide chain can 
be described by a set of dihedral angles (j) i , y/ i , CO i , where i 
= 1 ,...,N res on the backbone, plus a set of dihedral angles j , 
i = 1,..., N res , j = 1,..., J. , where J. denotes the dihedral 

angles of the side group on the i -th residue. If one wishes to 
allow capping of the peptide, then one has to include two 
more sets of dihedral angles. One could be defined as cp* ,k = 

1,..., K n for those dihedral angles on the amino end group, 
and the other could be defined as cp ^, k = for those 

dihedral angles on the carbonyl end group. In this paper the 
complete ECEPP/3 [10] force field was used. This force field 
is built upon the assumptions that the bond lengths and angles 
are at their equilibrium values, and that the resulting function 
is in reality a conformational energy surface made of a 
summation over interactions of types 1-4 and higher. These 
interactions take into account electrostatic, nonbonded, 
hydrogen bond, and torsional energies, plus other empirical 
terms that take into account a loop closing potential in the 
case that the peptide has intramolecular disulfide bonds, and 
fixed conformational energies for the propyl and 
hydroxypropyl residues. A condensed description of the 
ECEPP/3 force field could be written as: 

u = u elec + U nonb + u hb +u tor + u loop + u s _ s 

Where 
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All constants are estimated by fitting of experimental data 
[10]. Given these definitions, the potential energy 
minimization problem can be summarized as follows: 

minimize Ui^y/,, o) t , x!, (p k , <p k ) 
subject to the particular constrains: 

-18CP < $, y/i <+l8QP,i = l,...,N res 
-10° < («,. — 180°) < +10°,/ = 1,..., N res 
-180° < j/ < +180° ,i = 1,..., N„,j = 1,..., J‘ 

-1 80P < <p k , <p c k < +180°, k = 1,..., K N ,k= 1,..., K c 

The most important points in the implementation of a bee 
swarm optimization to our particular application are: the 
search space X, the cost function/ The cost function/(v) is 
the empirical energy function ECEPP/3, which is designed to 
work in angle space X , while keeping bond length and bond 
angle values constant, and where no solvent effects are 
included. 


II. Markov Chain Monte Carlo 

Markov chain Monte Carlo methods are a class of 
sample-generating techniques by controlling how a random 
walk behaves. It attempts to directly draw samples from some 
complex probability distribution based on constructing a 
Markov chain that has the desired distribution as its 
equilibrium distribution. The state of the chain after a large 
number of steps is then used as a sample of the desired 
distribution. The quality of the sample improves as a function 
of the number of steps. Usually it is not hard to construct a 
Markov chain with the desired properties. The more difficult 
problem is to determine how many steps are needed to 
converge to the stationary distribution within an acceptable 
error. The Markov chain Monte Carlo has become a powerful 
tool for Bayesian statistical analysis, Monte Carlo 
simulations, and potentially optimization with high 
nonlinearity. There are many ways to choose the transition 
probability, and different choices will result in different 
behaviour of Markov chain. In essence, the characteristics of 
the transition kernel largely determine how the Markov chain 
of interest behaves, which also determines the efficiency and 
convergence of Markov chain Monte Carlo sampling. There 
are several widely used sampling algorithms, such as 
Metropolis-Hasting Algorithm [25] and Gibbs Sampler [26]. 

A. Metropolis-Hastings Sampling Algorithm 

The basic idea of Markov chain Monte Carlo methods is to 
construct a Markov chain with the specified stationary 
distribution, namely n(6) , then run the chain with full length 
till the sample chain value close enough to its stationary 
distribution. Then take stationary chains as the samples of 
7 t( 0 ) and make variety of statistical inference based on these 

samples. The most popular Markov chain Monte Carlo 
sampling method is Metropolis-Hastings algorithm, which 
means sampling starts from another easily known reversible 
Markov chain Q , and obtain the new Markov chain by 
comparing. It generates a random walk using a proposal 
density and a method for rejecting proposed moves. 
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To draw samples from the target distribution, we let 
n{6) — P • p{6) , where p is a normalizing constant which 
is either difficult to estimate or not known. We will see later 
that the normalizing factor p disappear in the expression of 
acceptance probability. The Metropolis-Hastings algorithm 
essentially expresses an arbitrary transition probability from 
state 6 to (j) as the product of an arbitrary transition kernel 

q{0, (/)) and a probability a{6 , (f>) . That is, 

P(0, 4>) = P(0 = q{0, <t>)a{9, <f>) 

Here q is the proposal distribution function, while a(0,<p) 
can be considered as the acceptance rate from state 6 to (j) , 


and can be determined by 


a{6,(f)) = min 


n(0)q(<f>,0) 

7r(0)q(0,<f>) 


4 


l p(0)q(&,0) 

\p{0)q(0,(p) 


The essence of Metropolis-Hastings algorithm is to first 
propose a candidate 6 *, then accept it with probability a . 
That is, o if a > u where u is a random value drawn 


from an uniform distribution in [0, 1], otherwise Q <^0 r It 

is straightforward to verify that the reversibility condition is 

satisfied by the Metropolis-Hastings kernel 

q(0, <f))a(0, <t>)7i(0) = q((f>, 0)a(<fi, &)n{<jj ), 

for all 9,(j). Consequently, the Markov chain will converge 


to a stationary distribution which is the target distribution 
7l{0). 

In a special case, when the transition kernel is symmetric is its 
arguments, or q(0, (/)) — q(cj), 6) , for all 6 , (j) , then the 
acceptance rate a(0,<f>) become 

a{0 , (/)) = min] \ l, 

1 P(0) J 


And the Metropolis-Hastings algorithm reduces to the classic 
Metropolis algorithm. In this case, the associated Markov 
chain is called as symmetric chain. In a special case when 
a — 1 is used, that is the acceptance probability is always 1, 
then the Metropolis-Hastings degenerates into the classic 
widely used Gibbs sampling algorithm. However, Gibbs 
sampler becomes very inefficient for the distributions that are 
non-normally distributed or highly nonlinear. 


B. Random Walk and Levy Flight 

A random walk is a random process which consists of taking a 
series of consecutive random steps. The sum of each 
consecutive step which is a random step drawn from a random 
distribution forms a random walk. It means the next state will 
only depend on the current existing state and the transition 
from the existing state to the next state. This is typically the 
main property of a Markov chain. If the step size obeys the 
Gaussian distribution, the random walk becomes the 
Brownian motion. In theory, as the number of steps increases, 
the central limit theorem implies that the random walk should 
approaches a Gaussian distribution. If the step size obeys 
other distribution, we have to deal with more generalized 
random walk. A special case is when the step size obeys the 
Levy distribution, such a random walk is called a Levy flight 
or Levy walk. Levy flight is a random walk whose step length 
is drawn from the heavy-tailed Levy distribution often in 
terms of a simple power law formula. It is worth to point out 


that a power law distribution is often link to some scale free 
characteristics, and Levy flights can thus show self-similarity 
and fractal behaviour in the fight patterns. 

III. Markov Chain Sampling for Optimization Search 


A simple random walk can be considered as a Markov chain. 
In a probability distribution, the largest density area is mostly 
tending to be sampled. So the sampling density function 
should converge to the maximum point of maximum 
probability if the sample is sufficiently large. Thus 
establishing links between the function maximum value and 
sampling extreme statistics. We can use Markov chain Monte 
Carlo to simulate a sample of this distribution. And the 
optimal will appear most frequently in the sample. That is, 
the optimal state will have the greatest probability. 

Suppose that we are interested in exploring solutions X that 
minimize an objective function /(x) <= R , where 


x = (x lv .., x n ) eR n . That is, if we want to find the minimum 
of an objective function /(x)<=R at x = x* so that 
f* = f(x*)< f(x) • We can convert it to a target 
distribution for a Markov chain 

7r(x) oc 

where p > 0 is a parameter which act as a normalized factor. 
P should be chosen so that the probability is close to 1 when 
x —» x*. At X = x*, tc — n(x *) > 7T(x) . This often requires 
that the formulation of f(x) should be non-negative, which 
means that some objective functions can be shifted by a large 
constant C>0 if necessary. Then, a Markov chain is 
constructed to sample 7f(x) . Typically, the solutions in the 
vicinity of the global minimum of f(x) are most likely to be 
drawn in the sampling process. Therefore, Markov chain 
Monte Carlo can also be used for optimization purposes. To 
design a Markov chain with stationary distribution n{x) , the 
maximum point in finite sampling from distribution n{x) 
will be sufficiently close to the maximum point of /(x) in 
the feasible region. When the transition kernel is symmetric is 
its arguments, or q(y,x t ) = q(x t , y) , then the acceptance 
rate a(x t ,y) become 


a(x t ,y) = miiK 1, 


7r{x t )q(x t ,y) 


= min 


1, 


^•(y) j 

7T(x t )j 


The proposed Markov chain sampling for optimization search 
algorithm is: 


(1) Start with x 0 , at t = 0, X 0 = X 0 

(2) Propose a new solution y 

(3) Drawn u from the uniform distribution U ( 0 , 1 ) 

(4) Compute a(x t , y) = min] 1, 

[ n(x t ) 

[y u< a (with probability a) 

(5) Take x ?+1 = \ 

[x f u > a (with probability 1 - a) 


38 


www.erpublication.org 








A Markov Chain Sampling Method for Conformational Energy Optimization of Peptides 


f(x*)> f(x t+1 ) . 

f(x t )<f(x t+l ) 

Repeat (2) to (6). If the iteration times are large enough, then 

* 

X t will convergence to the maximum point of the objective 
function f(x) in distribution. We can see from the problem 
analysis above that the key points of Markov chain sampling 
method are designing of general probability density function 
7r(x) and uniform sampling from conditional constraint 
region. 

In order to solve an optimization problem, we can search the 
solution by performing a random walk starting from a good 
initial but random guess solution. However, to be 
computationally efficient and effective in searching for new 
solutions, we can keep the best solutions found so far, and to 
increase the mobility of the random walk so as to explore the 
search space more effectively. We can find a way to control 
the walk in such a way that it can move towards the optimal 
solutions more quickly, rather than wander away from the 
potential best solutions. These are the challenges for the most 
metaheuristic algorithms. The same issues are also important 
for Monte Carlo simulations and Markov chain sampling 
techniques. An important link between Markov chain and 
optimization is that some heuristic or metaheuristic search 
algorithms such as simulated annealing use a trajectory-based 
approach. They start with some initial random solution, and 
propose a new solution randomly. Then the move is accepted 
or not, depending on some probability. It is similar to a 
Markov chain. In fact, the standard simulated annealing is a 
random walk. Simulated annealing is a probabilistic method 
for finding global minimum of some cost function introduced 
by Kirkpatrick et al. [27]. It searches local minimum, and 
finally stays at the global minimum given enough time. This 
sampling method was originally extended from Metropolis 
Algorithm [28] by implanting a temperature function T. T is 
used to control the difficulty for the stochastic sampler to 
escape from a local minimum and reach the global optimal for 
a non-optimal state. Algorithms such as simulated annealing 
which use a single Markov chain may not be very efficient. In 
practice, it is usually advantageous to use multiple Markov 
chains in parallel to increase the overall efficiency. In fact, 
the algorithms such as particle swarm optimization can be 
viewed as multiple interacting Markov chains, though such 
theoretical analysis remains almost intractable. The theory of 
interacting Markov chains is complicated and yet still under 
development. However, any progress in such areas will play a 
central role in the understanding how population- and 
trajectory-based metaheuristic algorithms perform under 
various conditions. 


(6) Take x* +l = < 1 


being trapped into local optimum, chaotic sequence and a 
chaotic Levy flight can be incorporated in the meta-heuristic 
search for efficiently generating new solutions. In the paper 
[29], we presented synergistic strategies for meta-heuristic 
optimization learning, with an emphasis on the balance 
between intensification and diversification. We showed some 
promising efficiency for global optimization. Interestingly, it 
can be viewed to link with optimization search and Markov 
chain sampling under appropriate conditions. 


IV. Simulation Results 

Met-enkephalin has 24 dihedral angles, that according to our 
definition of a space search means that a set of 24 variables 
will be optimized. In the following table, we show our results 
of the best low-energy conformations of Met-enkephalin 
using the proposed global optimization method. 


Tyr 

<?> 

-83.5 

¥ 

155.8 

CO 

-177.2 

Xi 

-173.2 

X 2 

79.4 

x 3 

-166.4 

Gly 

* 

-154.3 

¥ 

86.0 

CO 

168.5 

Gly 

* 

82.9 

¥ 

-75.1 

CO 

-170.0 

Phe 

<t> 

-136.9 

¥ 

19.1 

CO 

-174.1 

Xi 

58.9 

X 2 

94.6 

Met 

<t> 

-163.5 

¥ 

161.2 

CO 

-179.8 

Xi 

52.9 

Xi 

175.3 

X 3 

-179.8 

X A 

58.6 

Energy 

-11.707 


V. Conclusion 


In addition, a Markov chain is said to be ergodic or 
irreducible if it is possible to go from every state to every state. 
Furthermore, the use of a uniform distribution is not the only 
way to achieve randomization. In fact, random walks such as 
Levy flights on a global scale are more efficient. On the other 
hand, the track of chaotic variable can travel ergodically over 
the whole search space. In general, the chaotic variable has 
special characters, i.e., ergodicity, pseudo-randomness and 
irregularity. To enrich the searching behavior and to avoid 


Protein function is related to its structure. In order to predict 
the protein structure computationally, protein must be 
represented in favorable representation. An efficient energy 
function should be used to calculate the protein energy, and 
then a conformational search algorithm must be applied to 
find the lowest free energy conformation. To this end, an 
energy function is used to calculate its energy and a 
conformational search algorithm is used to search the 
conformational search space to find the lowest free energy 
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conformation. Markov chain Monte Carlo is a family of 
simulation methods, which generate samples of Markov 
Chain processes. In this paper, we set up a framework of 
Markov chain sampling to search the protein conformational 
search space. The proposed algorithm was able to find the 
lowest free energy conformation of Met-enkephalin using 
ECEPP/3 force fields. 
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